Superposed speech localisation using frequency tracking

Authors

Maxime Le Coz

Julien Pinquier

Régine André-Obrecht

Abstract

In this paper, we present a new approach for localising areas of superposed speech. The system tracks the frequencies of speech segments by following the evolution of the frequencies of highest amplitude, and requires no training of acoustic or prosodic models. The resulting frequency tracks are then grouped using a distance based on harmonicity, with each group corresponding to the production of a single speaker. The co-occurrence of different harmonic groups is then taken as an indication that multiple speakers are present. The method was evaluated on data from the French ANR evaluation campaign ETAPE, and the results show that the approach is usable.
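The grouping idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the tracking procedure or the harmonicity distance, so the code below assumes a simplified per-frame view in which the tracked frequencies are already available as a list of values in Hz. The function names `harmonic_groups` and `is_superposed`, the relative tolerance `tol`, and the `min_members` threshold are all hypothetical choices made for illustration.

```python
def harmonic_groups(freqs, tol=0.02):
    """Greedily group frequencies (Hz) that are near-integer multiples
    of a common fundamental.

    Assumption: the lowest unassigned frequency is treated as the
    fundamental of a new group. This is a simplification, not the
    harmonicity distance used in the paper.
    """
    groups = []  # list of (fundamental, [member frequencies])
    for f in sorted(freqs):
        for f0, members in groups:
            ratio = f / f0
            k = round(ratio)
            # f fits the group if it lies within a relative tolerance
            # of the k-th multiple of the group's fundamental.
            if k >= 1 and abs(ratio - k) <= tol * k:
                members.append(f)
                break
        else:
            # No existing group explains f, so it starts a new one.
            groups.append((f, [f]))
    return groups


def is_superposed(freqs, min_members=2, tol=0.02):
    """Flag a frame as superposed speech when at least two harmonic
    groups each contain `min_members` or more frequencies."""
    groups = harmonic_groups(freqs, tol)
    strong = [g for g in groups if len(g[1]) >= min_members]
    return len(strong) >= 2


if __name__ == "__main__":
    one_speaker = [100, 200, 300, 400]
    two_speakers = [100, 200, 300, 130, 260, 390]
    print(is_superposed(one_speaker))
    print(is_superposed(two_speakers))
```

In this toy setting, a frame containing harmonics of a single fundamental yields one group, while a frame mixing two unrelated harmonic series yields two, which is the co-occurrence cue the abstract describes. A real system would group whole tracks over time rather than isolated frames, and would need to handle missing fundamentals and speakers whose fundamentals are harmonically related.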

Similar resources

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Particle Filtering Methods for Acoustic Source Localisation and Tracking

The task of acoustic source tracking plays an important role in many practical speech acquisition systems. This research presents an extensive study of sequential Monte Carlo methods applied to the source localisation problem, based on the signals received at an array of microphones. A general framework for acoustic source localisation using particle filtering is proposed, and four different al...

Importance Sampling Particle Filter for Robust Acoustic Source Localisation and Tracking in Reverberant Environments

The concept of acoustic source localisation and tracking (ASLT) plays an important role in many practical speech acquisition systems. Exact knowledge of the speaker position is usually the key to acquiring clean speech using e.g. beamforming or equalisation. Multipath sound propagation in practical environments however constitutes a major challenge to overcome for any array-based tracker. The p...

Integrating pitch and localisation cues at a speech fragment level

This paper proposes a novel speech-fragment based approach for processing binaural data to improve the estimation of speech source locations in reverberant, multi-speaker recordings. The technique employs two stages. First, a robust multipitch tracking algorithm is used to locate local spectro-temporal ‘speech fragments’ – regions where the energy in the mixture is dominated by a single speech ...

Binaural sound source localisation and tracking using a dynamic spherical head model

This paper introduces a binaural model for the localisation and tracking of a moving sound source’s azimuth in the horizontal plane. The model uses a nonlinear state space representation of the sound source dynamics including the current position of the listener’s head. The state is estimated via an unscented Kalman Filter by comparing the interaural level and time differences of the binaural s...

Publication date: 2013
